Skip to content

Conversation

c-rhodes
Copy link
Collaborator

@c-rhodes c-rhodes commented Sep 9, 2025

#157112 adds a test '@fold_phi_noundef_start_value' where we can't currently remove the freeze, even though it's possible. The two places in instcombine that deal with freezes are
'pushFreezeToPreventPoisonFromPropagating' and
'foldFreezeIntoRecurrence'. The former doesn't support PHIs at all and the latter is restricted to recurrences. It doesn't consider this test a recurrence as the BB of the frozen PHI node '%loop.latch' does not dominate either of the BBs ('%loop', '%if.else') of the incoming values.

Therefore, I've updated 'pushFreezeToPreventPoisonFromPropagating' to support pushing the freeze to the incoming PHI values, as long as the BB of the frozen PHI does not dominate the BB of the incoming value(s). This fixes the test case added in #157112 and the other test changes look sensible, although perhaps I'm missing some edge cases (?).

The 'match(U.get(), m_Undef())' check for undef PHI incoming value looks necessary to catch the case covered by '@two_undef' test in freeze-phi.ll. Without this check the undef becomes zero which doesnt seem right.

@c-rhodes c-rhodes requested a review from nikic September 9, 2025 14:12
@llvmbot llvmbot added llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms labels Sep 9, 2025
@llvmbot
Copy link
Member

llvmbot commented Sep 9, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Cullen Rhodes (c-rhodes)

Changes

#157112 adds a test '@fold_phi_noundef_start_value' where we can't currently remove the freeze, even though it's possible. The two places in instcombine that deal with freezes are
'pushFreezeToPreventPoisonFromPropagating' and
'foldFreezeIntoRecurrence'. The former doesn't support PHIs at all and the latter is restricted to recurrences. It doesn't consider this test a recurrence as the BB of the frozen PHI node '%loop.latch' does not dominate either of the BBs ('%loop', '%if.else') of the incoming values.

Therefore, I've updated 'pushFreezeToPreventPoisonFromPropagating' to support pushing the freeze to the incoming PHI values, as long as the BB of the frozen PHI does not dominate the BB of the incoming value(s). This fixes the test case added in #157112 and the other test changes look sensible, although perhaps I'm missing some edge cases (?).

The 'match(U.get(), m_Undef())' check for undef PHI incoming value looks necessary to catch the case covered by '@two_undef' test in freeze-phi.ll. Without this check the undef becomes zero which doesnt seem right.


Full diff: https://github.com/llvm/llvm-project/pull/157678.diff

4 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (+14-3)
  • (modified) llvm/test/Transforms/InstCombine/freeze-landingpad.ll (+2-3)
  • (modified) llvm/test/Transforms/InstCombine/freeze-phi.ll (+4-3)
  • (modified) llvm/test/Transforms/InstCombine/freeze.ll (+11-12)
diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index a74f292524b4d..4c562c5500de1 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -5042,10 +5042,18 @@ InstCombinerImpl::pushFreezeToPreventPoisonFromPropagating(FreezeInst &OrigFI) {
   //   Op1.fr = Freeze(Op1)
   //   ... = Inst(Op1.fr, NonPoisonOps...)
 
-  auto CanPushFreeze = [](Value *V) {
-    if (!isa<Instruction>(V) || isa<PHINode>(V))
+  auto CanPushFreeze = [this](Value *V) {
+    if (!isa<Instruction>(V))
       return false;
 
+    if (auto *PN = dyn_cast<PHINode>(V)) {
+      if (llvm::any_of(PN->incoming_values(), [this, &PN](Use &U) {
+            return DT.dominates(PN->getParent(), PN->getIncomingBlock(U)) ||
+                   match(U.get(), m_Undef());
+          }))
+        return false;
+    }
+
     // We can't push the freeze through an instruction which can itself create
     // poison.  If the only source of new poison is flags, we can simply
     // strip them (since we know the only use is the freeze and nothing can
@@ -5070,7 +5078,10 @@ InstCombinerImpl::pushFreezeToPreventPoisonFromPropagating(FreezeInst &OrigFI) {
         return nullptr;
 
       auto *UserI = cast<Instruction>(U->getUser());
-      Builder.SetInsertPoint(UserI);
+      if (auto *PN = dyn_cast<PHINode>(UserI))
+        Builder.SetInsertPoint(PN->getIncomingBlock(*U)->getTerminator());
+      else
+        Builder.SetInsertPoint(UserI);
       Value *Frozen = Builder.CreateFreeze(V, V->getName() + ".fr");
       U->set(Frozen);
       continue;
diff --git a/llvm/test/Transforms/InstCombine/freeze-landingpad.ll b/llvm/test/Transforms/InstCombine/freeze-landingpad.ll
index 29167958c857f..259808c046490 100644
--- a/llvm/test/Transforms/InstCombine/freeze-landingpad.ll
+++ b/llvm/test/Transforms/InstCombine/freeze-landingpad.ll
@@ -14,14 +14,13 @@ define i32 @propagate_freeze_in_landingpad() personality ptr null {
 ; CHECK-NEXT:    [[RES1:%.*]] = invoke i32 @foo()
 ; CHECK-NEXT:            to label [[NORMAL_RETURN]] unwind label [[EXCEPTIONAL_RETURN]]
 ; CHECK:       normal_return:
-; CHECK-NEXT:    [[INC]] = add nuw nsw i32 [[X]], 1
+; CHECK-NEXT:    [[INC]] = add i32 [[X]], 1
 ; CHECK-NEXT:    br label [[INVOKE_BB1]]
 ; CHECK:       exceptional_return:
 ; CHECK-NEXT:    [[PHI:%.*]] = phi i32 [ [[X]], [[INVOKE_BB1]] ], [ 0, [[INVOKE_BB2]] ]
 ; CHECK-NEXT:    [[LANDING_PAD:%.*]] = landingpad { ptr, i32 }
 ; CHECK-NEXT:            cleanup
-; CHECK-NEXT:    [[FR:%.*]] = freeze i32 [[PHI]]
-; CHECK-NEXT:    [[RES:%.*]] = shl i32 [[FR]], 1
+; CHECK-NEXT:    [[RES:%.*]] = shl i32 [[PHI]], 1
 ; CHECK-NEXT:    ret i32 [[RES]]
 ;
 entry:
diff --git a/llvm/test/Transforms/InstCombine/freeze-phi.ll b/llvm/test/Transforms/InstCombine/freeze-phi.ll
index cdc9a5efe5933..098e6c5f6cdbb 100644
--- a/llvm/test/Transforms/InstCombine/freeze-phi.ll
+++ b/llvm/test/Transforms/InstCombine/freeze-phi.ll
@@ -94,13 +94,14 @@ define i32 @two(i1 %cond, i32 %x, i32 %x2) {
 ; CHECK-LABEL: @two(
 ; CHECK-NEXT:    br i1 [[COND:%.*]], label [[A:%.*]], label [[B:%.*]]
 ; CHECK:       A:
+; CHECK-NEXT:    [[X_FR:%.*]] = freeze i32 [[X:%.*]]
 ; CHECK-NEXT:    br label [[C:%.*]]
 ; CHECK:       B:
+; CHECK-NEXT:    [[X2_FR:%.*]] = freeze i32 [[X2:%.*]]
 ; CHECK-NEXT:    br label [[C]]
 ; CHECK:       C:
-; CHECK-NEXT:    [[Y:%.*]] = phi i32 [ [[X:%.*]], [[A]] ], [ [[X2:%.*]], [[B]] ]
-; CHECK-NEXT:    [[Y_FR:%.*]] = freeze i32 [[Y]]
-; CHECK-NEXT:    ret i32 [[Y_FR]]
+; CHECK-NEXT:    [[Y:%.*]] = phi i32 [ [[X_FR]], [[A]] ], [ [[X2_FR]], [[B]] ]
+; CHECK-NEXT:    ret i32 [[Y]]
 ;
   br i1 %cond, label %A, label %B
 A:
diff --git a/llvm/test/Transforms/InstCombine/freeze.ll b/llvm/test/Transforms/InstCombine/freeze.ll
index af5cb0c75537b..91e89b3b5a332 100644
--- a/llvm/test/Transforms/InstCombine/freeze.ll
+++ b/llvm/test/Transforms/InstCombine/freeze.ll
@@ -269,15 +269,15 @@ define void @freeze_dominated_uses_catchswitch(i1 %c, i32 %x) personality ptr @_
 ; CHECK-NEXT:    invoke void @use_i32(i32 0)
 ; CHECK-NEXT:            to label %[[CLEANUP:.*]] unwind label %[[CATCH_DISPATCH:.*]]
 ; CHECK:       [[IF_ELSE]]:
+; CHECK-NEXT:    [[X_FR:%.*]] = freeze i32 [[X]]
 ; CHECK-NEXT:    invoke void @use_i32(i32 1)
 ; CHECK-NEXT:            to label %[[CLEANUP]] unwind label %[[CATCH_DISPATCH]]
 ; CHECK:       [[CATCH_DISPATCH]]:
-; CHECK-NEXT:    [[PHI:%.*]] = phi i32 [ 0, %[[IF_THEN]] ], [ [[X]], %[[IF_ELSE]] ]
+; CHECK-NEXT:    [[PHI:%.*]] = phi i32 [ 0, %[[IF_THEN]] ], [ [[X_FR]], %[[IF_ELSE]] ]
 ; CHECK-NEXT:    [[CS:%.*]] = catchswitch within none [label %[[CATCH:.*]], label %catch2] unwind to caller
 ; CHECK:       [[CATCH]]:
 ; CHECK-NEXT:    [[CP:%.*]] = catchpad within [[CS]] [ptr null, i32 64, ptr null]
-; CHECK-NEXT:    [[PHI_FREEZE:%.*]] = freeze i32 [[PHI]]
-; CHECK-NEXT:    call void @use_i32(i32 [[PHI_FREEZE]]) [ "funclet"(token [[CP]]) ]
+; CHECK-NEXT:    call void @use_i32(i32 [[PHI]]) [ "funclet"(token [[CP]]) ]
 ; CHECK-NEXT:    unreachable
 ; CHECK:       [[CATCH2:.*:]]
 ; CHECK-NEXT:    [[CP2:%.*]] = catchpad within [[CS]] [ptr null, i32 64, ptr null]
@@ -384,15 +384,16 @@ define i32 @freeze_phi_followed_by_phi(i1 %c, i32 %y, i32 %z) {
 ; CHECK-LABEL: define i32 @freeze_phi_followed_by_phi(
 ; CHECK-SAME: i1 [[C:%.*]], i32 [[Y:%.*]], i32 [[Z:%.*]]) {
 ; CHECK-NEXT:  [[ENTRY:.*]]:
+; CHECK-NEXT:    [[Z_FR:%.*]] = freeze i32 [[Z]]
+; CHECK-NEXT:    [[Y_FR:%.*]] = freeze i32 [[Y]]
 ; CHECK-NEXT:    br i1 [[C]], label %[[IF:.*]], label %[[JOIN:.*]]
 ; CHECK:       [[IF]]:
 ; CHECK-NEXT:    br label %[[JOIN]]
 ; CHECK:       [[JOIN]]:
-; CHECK-NEXT:    [[X:%.*]] = phi i32 [ [[Y]], %[[IF]] ], [ [[Z]], %[[ENTRY]] ]
-; CHECK-NEXT:    [[PHI:%.*]] = phi i32 [ [[Z]], %[[IF]] ], [ [[Y]], %[[ENTRY]] ]
-; CHECK-NEXT:    [[FR:%.*]] = freeze i32 [[X]]
-; CHECK-NEXT:    call void @use_i32(i32 [[FR]])
-; CHECK-NEXT:    call void @use_i32(i32 [[FR]])
+; CHECK-NEXT:    [[X:%.*]] = phi i32 [ [[Y_FR]], %[[IF]] ], [ [[Z_FR]], %[[ENTRY]] ]
+; CHECK-NEXT:    [[PHI:%.*]] = phi i32 [ [[Z_FR]], %[[IF]] ], [ [[Y_FR]], %[[ENTRY]] ]
+; CHECK-NEXT:    call void @use_i32(i32 [[X]])
+; CHECK-NEXT:    call void @use_i32(i32 [[X]])
 ; CHECK-NEXT:    ret i32 [[PHI]]
 ;
 entry:
@@ -1141,8 +1142,7 @@ exit:
 }
 
 ; We can remove this freeze as the incoming values to the PHI have the same
-; well-defined start value and the GEP can't produce poison, but this is
-; currently unsupported.
+; well-defined start value and the GEP can't produce poison.
 define void @fold_phi_noundef_start_value(ptr noundef %init, i1 %cond.0, i1 %cond.1, i1 %cond.2) {
 ; CHECK-LABEL: define void @fold_phi_noundef_start_value(
 ; CHECK-SAME: ptr noundef [[INIT:%.*]], i1 [[COND_0:%.*]], i1 [[COND_1:%.*]], i1 [[COND_2:%.*]]) {
@@ -1156,8 +1156,7 @@ define void @fold_phi_noundef_start_value(ptr noundef %init, i1 %cond.0, i1 %con
 ; CHECK-NEXT:    br label %[[LOOP_LATCH]]
 ; CHECK:       [[LOOP_LATCH]]:
 ; CHECK-NEXT:    [[IV_2:%.*]] = phi ptr [ [[IV_0]], %[[LOOP]] ], [ [[IV_1]], %[[IF_ELSE]] ]
-; CHECK-NEXT:    [[IV_2_FR:%.*]] = freeze ptr [[IV_2]]
-; CHECK-NEXT:    [[IV_2_FR_INT:%.*]] = ptrtoint ptr [[IV_2_FR]] to i64
+; CHECK-NEXT:    [[IV_2_FR_INT:%.*]] = ptrtoint ptr [[IV_2]] to i64
 ; CHECK-NEXT:    [[IV_0_INT:%.*]] = ptrtoint ptr [[IV_0]] to i64
 ; CHECK-NEXT:    [[IDX:%.*]] = sub i64 [[IV_0_INT]], [[IV_2_FR_INT]]
 ; CHECK-NEXT:    [[IV_0_NEXT]] = getelementptr i8, ptr [[IV_0]], i64 [[IDX]]

return false;

if (auto *PN = dyn_cast<PHINode>(V)) {
if (llvm::any_of(PN->incoming_values(), [this, &PN](Use &U) {
return DT.dominates(PN->getParent(), PN->getIncomingBlock(U)) ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check doesn't work for irreducible cycles. But we have an isBackEdge() helper in InstCombine for this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a check for isBackEdge(). Although unlike the other edge cases you pointed out I couldn't write a test case that triggered an issue for this one.

Copy link

⚠️ undef deprecator found issues in your code. ⚠️

You can test this locally with the following command:
git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)' 'HEAD~1' HEAD llvm/lib/Transforms/InstCombine/InstructionCombining.cpp llvm/test/Transforms/InstCombine/freeze-landingpad.ll llvm/test/Transforms/InstCombine/freeze-phi.ll llvm/test/Transforms/InstCombine/freeze.ll

The following files introduce new uses of undef:

  • llvm/test/Transforms/InstCombine/freeze.ll

Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields undef. You should use poison values for placeholders instead.

In tests, avoid using undef and having tests that trigger undefined behavior. If you need an operand with some unimportant value, you can add a new argument to the function and use that instead.

For example, this is considered a bad practice:

define void @fn() {
  ...
  br i1 undef, ...
}

Please use the following instead:

define void @fn(i1 %cond) {
  ...
  br i1 %cond, ...
}

Please refer to the Undefined Behavior Manual for more information.

@c-rhodes
Copy link
Collaborator Author

ping

@c-rhodes c-rhodes requested a review from david-arm September 22, 2025 08:39
@nikic
Copy link
Contributor

nikic commented Sep 22, 2025

@zyw-bot csmith-fuzz

// invalidating the iterator. We simply don't support this case, but it
// could be handled if there's a use case.
if (isBackEdge(InBB, BB) || !VisitedBBs.insert(InBB).second ||
match(U.get(), m_Undef()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess in theory we could push the freeze to an undef argument, right? I noticed that this IR:

define i64 @foo() {
entry:
  %res = freeze i64 undef
  ret i64 %res
}

gets optimised by instcombine to just

define i64 @foo() {
entry:
  ret i64 0
}

but this could be done in a later patch if we see a need for it.

Copy link
Collaborator Author

@c-rhodes c-rhodes Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

without this check freeze is pushed to %x in this test:

define i32 @two_undef(i8 %cond, i32 %x) {
which I don't think is right

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushing it through a freeze argument is correct, it's just likely to be non-profitable right now. I think it makes sense to skip that case.

@@ -5070,7 +5093,10 @@ InstCombinerImpl::pushFreezeToPreventPoisonFromPropagating(FreezeInst &OrigFI) {
return nullptr;

auto *UserI = cast<Instruction>(U->getUser());
Builder.SetInsertPoint(UserI);
if (auto *PN = dyn_cast<PHINode>(UserI))
Builder.SetInsertPoint(PN->getIncomingBlock(*U)->getTerminator());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure why this change is needed. Before this PR CanPushFreeze always returned false for PHI nodes and the existing Builder.SetInsertPoint(UserI); seemed to work perfectly fine. All that happens now is that we sometimes return false if it's a PHI node.

Is this fixing an existing bug?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the existing Builder.SetInsertPoint(UserI); hasn't been removed

Copy link
Contributor

@david-arm david-arm Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps I'm missing something, but before your patch CanPushFreeze always returned false for PHI nodes which meant we were calling Builder.SetInsertPoint(UserI); for PHI nodes.

However, even in places where we still return false for PHI nodes same as before we're now calling Builder.SetInsertPoint(PN->getIncomingBlock(*U)->getTerminator());. For example, it looks like you've changed the insertion point for PHI nodes with undef as incoming value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that CanPushFreeze() is called on the use, while this is checking for the user being a phi node. Previously we could not have ended up with a phi user, as that would imply that we have pushed through a phi.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok I think I understand now. So basically if CanPushFreeze returns true for a PHI node we fall through to code below that adds the operands of the PHI to the worklist. Then if CanPushFreeze returns false for one of those operands we'll end up asking for the user, which will be the PHI.

PR llvm#157112 adds a test '@fold_phi_noundef_start_value' where we can't
currently remove the freeze, even though it's possible. The two places
in instcombine that deal with freezes are
'pushFreezeToPreventPoisonFromPropagating' and
'foldFreezeIntoRecurrence'.  The former doesn't support PHIs at all and
the latter is restricted to recurrences. It doesn't consider this test a
recurrence as the BB of the frozen PHI node '%loop.latch' does not
dominate either of the BBs ('%loop', '%if.else') of the incoming values.

Therefore, I've updated 'pushFreezeToPreventPoisonFromPropagating' to
support pushing the freeze to the incoming PHI values, as long as the BB
of the frozen PHI *does not* dominate the BB of the incoming value(s).
This fixes the test case added in llvm#157112 and the other test changes
look sensible, although perhaps I'm missing some edge cases (?).

The 'match(U.get(), m_Undef())' check for undef PHI incoming value looks
necessary to catch the case covered by '@two_undef' test in
freeze-phi.ll. Without this check the undef becomes zero which doesnt
seem right.
- remove unused %cond.2 from tests
- bail on backedge from BB of incoming value of phi to BB of freeze
- bail if incoming value of phi is from invoke and add a test
- bail if phi has multiple values from same predecessor and test
This reverts commit 790e7ede98c32b4a8c2f0fa69a2817edfc63e475.
@c-rhodes
Copy link
Collaborator Author

Whilst doing some further testing I found 527.cam4_r in SPEC2017 is hanging with this patch. I see:

OrigUse->get() = <2 x i1> poison
OrigFI:   %.fr91682 = freeze <2 x i1> poison
OrigUse->get() = <2 x i1> poison
OrigFI:   %188 = freeze <2 x i1> %broadcast.splat2369
OrigUse->get() =   %broadcast.splat2369 = shufflevector <2 x i1> %broadcast.splatinsert2368, <2 x i1> poison, <2 x i32> zeroinitializer

is repeated when dumping OrigFI/OrigUse in pushFreezeToPreventPoisonFromPropagating, so I added check:

index 6004feb1d885..be983ed63daf 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -5026,6 +5026,9 @@ InstCombinerImpl::pushFreezeToPreventPoisonFromPropagating(FreezeInst &OrigFI) {
       if (U == OrigUse)
         return nullptr;

+      if (match(V, m_Poison()))
+        return nullptr;
+
       auto *UserI = cast<Instruction>(U->getUser());
       if (auto *PN = dyn_cast<PHINode>(UserI))
         Builder.SetInsertPoint(PN->getIncomingBlock(*U)->getTerminator());

which fixes it, but still trying to write a test case. I did isolate the function in cam4 and attempt to reduce it will llvm-reduce, but just get testcases that take exactly N seconds to compile.

I also collected data on the # of freezes in SPEC2017 (Ofast, LTO) after this patch and there's a ~13% increase. #154336 also caused an increase of ~27%. Although as previously mentioned #154336 (comment), I'm not convinced this is a particularly useful metric, but just wanted to share this.

@c-rhodes c-rhodes force-pushed the instcombine-phi-gep-same-noundef-source branch from e1ef7e0 to 05d8a51 Compare September 30, 2025 12:09
@c-rhodes
Copy link
Collaborator Author

I've rebased and added a fix for the compile-time issue in spec2017 527.cam4_r, although I'm unable to write a test for it. This IR hits the fix I've added: https://godbolt.org/z/KrTcWroqr, initially I bailed out here but it caused a regression, so I changed it to not freeze the poison value instead.

@nikic
Copy link
Contributor

nikic commented Sep 30, 2025

I don't understand the fix you have applied. Doesn't this mean that we will now fail to freeze direct poison operands?

Can you provide some more context on what causes the compile-time issue?

@nikic
Copy link
Contributor

nikic commented Sep 30, 2025

I also collected data on the # of freezes in SPEC2017 (Ofast, LTO) after this patch and there's a ~13% increase. #154336 also caused an increase of ~27%. Although as previously mentioned #154336 (comment), I'm not convinced this is a particularly useful metric, but just wanted to share this.

The tradeoff here is basically between having less freeze or having more freeze in positions where it has less impact. That the total number of freezes increases is expected.

@c-rhodes
Copy link
Collaborator Author

I don't understand the fix you have applied. Doesn't this mean that we will now fail to freeze direct poison operands?

Can you provide some more context on what causes the compile-time issue?

Sorry I admit it's not ideal, but I'm at a bit of a loss as to what to do about it. I'm seeing lots of freezes being created on direct poison operands, e.g. OrigFI: %.fr91682 = freeze <2 x i1> poison (notice the number in the var name). I thought perhaps freeze is being pushed around the loop, I've tried to reduce it and found it reduces to a testcase that takes as many seconds to compile as the timeout. The best I've managed to get is two tests, one that takes 5 seconds to compile and one 10 seconds:

  1. https://gist.github.com/c-rhodes/37911b2b01fe450e23f9336a736193db
  2. https://gist.github.com/c-rhodes/080fa33a2ffe1ed3b82beb5557b95b92

and the 10 second one only has 31 more lines than the 5 second one and looking at the diff quite a few more PHIs: https://gist.github.com/c-rhodes/9042f77580a24df6dfcd849b55d5de64

but I'm struggling to isolate the specific issue or create a small reproducer. The previous godbolt link i shared is from llvm-reduce, but the loops and control-flow is gone.

@c-rhodes
Copy link
Collaborator Author

c-rhodes commented Oct 6, 2025

Sharing some findings:

  • There is no hang when disabling both foldOpIntoPhi and foldFreezeIntoRecurrence
    • this suggests these patterns are conflicting with pushFreezeToPreventPoisonFromPropagating
    • both patterns must be disabled, which suggests whatever the problem is exists in both
  • ~13% of the debug output is poison values (from 5s testcase):
    520544 freeze <2 x double> poison
    267968 freeze <2 x i1> poison
    456696 freeze <4 x double> poison
    
  • these comes from splats that are part of a wider vector reduction:
    1978 IC: pushFreezeToPreventPoisonFromPropagating ENTER  worklist=764 deferred=0 freeze=  %133 = freeze i1 %132
    1979 ADD DEFERRED:   %132 = icmp ne i4 %131, 0
    1980 ADD DEFERRED:   %131 = bitcast <4 x i1> %bin.rdx3030 to i4
    1981 ADD DEFERRED:   %bin.rdx3030 = or <4 x i1> %129, %bin.rdx3029
    1982 ADD DEFERRED:   %129 = or <4 x i1> %vec.phi3021, %125
    1983 ADD DEFERRED:   %vec.phi3021.fr = freeze <4 x i1> %vec.phi3021
    1984 ADD DEFERRED:   %125 = fcmp reassoc nsz arcp contract afn olt <4 x double> %wide.load3025, %broadcast.splat3015
    1985 ADD DEFERRED:   %wide.load3025.fr = freeze <4 x double> %wide.load3025
    1986 ADD DEFERRED:   %broadcast.splat3015 = shufflevector <4 x double> %broadcast.splatinsert3014, <4 x double> poison, <4 x i32> zeroinitializer
    1987 ADD DEFERRED:   %broadcast.splatinsert3014 = insertelement <4 x double> poison, double %108, i64 0
    1988 ADD DEFERRED:   %.fr = freeze <4 x double> poison
    1989 ADD DEFERRED:   %.fr1 = freeze double %108
    1990 ADD DEFERRED:   %.fr2 = freeze <4 x double> poison
    1991 ADD DEFERRED:   %bin.rdx3029 = or <4 x i1> %128, %bin.rdx3028
    1992 ADD DEFERRED:   %128 = or <4 x i1> %vec.phi3020, %124
    1993 ADD DEFERRED:   %vec.phi3020.fr = freeze <4 x i1> %vec.phi3020
    1994 ADD DEFERRED:   %124 = fcmp reassoc nsz arcp contract afn olt <4 x double> %wide.load3024, %broadcast.splat3015
    1995 ADD DEFERRED:   %wide.load3024.fr = freeze <4 x double> %wide.load3024
    1996 ADD DEFERRED:   %bin.rdx3028 = or <4 x i1> %127, %126
    1997 ADD DEFERRED:   %127 = or <4 x i1> %vec.phi3019, %123
    1998 ADD DEFERRED:   %vec.phi3019.fr = freeze <4 x i1> %vec.phi3019
    1999 ADD DEFERRED:   %123 = fcmp reassoc nsz arcp contract afn olt <4 x double> %wide.load3023, %broadcast.splat3015
    2000 ADD DEFERRED:   %wide.load3023.fr = freeze <4 x double> %wide.load3023
    2001 ADD DEFERRED:   %126 = or <4 x i1> %vec.phi3018, %122
    2002 ADD DEFERRED:   %vec.phi3018.fr = freeze <4 x i1> %vec.phi3018
    2003 ADD DEFERRED:   %122 = fcmp reassoc nsz arcp contract afn olt <4 x double> %wide.load3022, %broadcast.splat3015
    2004 ADD DEFERRED:   %wide.load3022.fr = freeze <4 x double> %wide.load3022
    2005 IC: Replacing   %133 = freeze i1 %132
    2006     with   %132 = icmp ne i4 %131, 0
    2007 IC: pushFreezeToPreventPoisonFromPropagating SUCCESS  worklist=764 deferred=26
    
  • Some calls to pushFreezeToPreventPoisonFromPropagating cause the Worklist to increase significantly, looking at the debug output for the 5s case the worst one increases the Worklist by 260 instructions:
    4299905 IC: Visiting:   %385 = bitcast <2 x i1> %384 to i2
    4299906 IC: Visiting:   %386 = icmp ne i2 %385, 0
    4299907 IC: Visiting:   %384 = or <2 x i1> %broadcast.splat2525, %383
    4299908 IC: Visiting:   %387 = freeze i1 %386
    4299909 IC: pushFreezeToPreventPoisonFromPropagating ENTER  worklist=12 deferred=0 freeze=  %387 = freeze i1 %386
    ...
    4300170 IC: Replacing   %387 = freeze i1 %386
    4300171     with   %386 = icmp ne i2 %385, 0
    4300172 IC: pushFreezeToPreventPoisonFromPropagating SUCCESS  worklist=12 deferred=260
    
  • I found reverting the change to pushFreezeToPreventPoisonFromPropagating in [InstCombine] Allow freezing multiple operands #154336 to emit the freezes in a more natural order fixes the hang:
diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index 6004feb1d885..51739e818d59 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -5041,7 +5047,7 @@ InstCombinerImpl::pushFreezeToPreventPoisonFromPropagating(FreezeInst &OrigFI) {
       continue;

     // reverse() to emit freezes in a more natural order.
-    for (Use &Op : reverse(I->operands())) {
+    for (Use &Op : I->operands()) {
       Value *OpV = Op.get();
       if (isa<MetadataAsValue>(OpV) || isGuaranteedNotToBeUndefOrPoison(OpV))
         continue;

although I'm not sure why that's the case or if it's sensible.

@c-rhodes
Copy link
Collaborator Author

c-rhodes commented Oct 7, 2025

I found reverting the change to pushFreezeToPreventPoisonFromPropagating in [InstCombine] Allow freezing multiple operands #154336 to emit the freezes in a more natural order fixes the hang:

diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index 6004feb1d885..51739e818d59 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -5041,7 +5047,7 @@ InstCombinerImpl::pushFreezeToPreventPoisonFromPropagating(FreezeInst &OrigFI) {
       continue;

     // reverse() to emit freezes in a more natural order.
-    for (Use &Op : reverse(I->operands())) {
+    for (Use &Op : I->operands()) {
       Value *OpV = Op.get();
       if (isa<MetadataAsValue>(OpV) || isGuaranteedNotToBeUndefOrPoison(OpV))
         continue;

@nikic any ideas here? Not sure if this is an acceptable fix.

@nikic
Copy link
Contributor

nikic commented Oct 10, 2025

@nikic any ideas here? Not sure if this is an acceptable fix.

I don't think so. This probably just fixes it by accident, by changing worklist order or something.

Something that may be worth doing (also independently of this PR) is teaching isGuaranteedNotToBePoison about the splat pattern. It includes two poison values, but only the poison-ness of the splatted value actually matters (plus some complications due to FMF probably).

Doing this might reduce the number of freezes we generate and the eliminate again. Though it doesn't really address whatever the underlying problem here is.

@nikic
Copy link
Contributor

nikic commented Oct 10, 2025

I think the problem here is similar to what I ran into with the original change in #154336, just for phi nodes. For cyclic phis, we only push the freeze to that phi, and then it will get pushed through in a separate transform. This kind of incrementally pushing freeze (instead of freezing the whole graph) ends up being slow in some cases, though I can't say I fully understand what causes it in this specific case.

Probably if we want to push through freeze we need to integrate the recurrence and non-recurrence cases...

@c-rhodes
Copy link
Collaborator Author

Probably if we want to push through freeze we need to integrate the recurrence and non-recurrence cases...

Do you mean perhaps doing this as part of #155638 or going even further and integrating foldOpIntoPhi / foldFreezeIntoRecurrence?

@nikic
Copy link
Contributor

nikic commented Oct 10, 2025

Probably if we want to push through freeze we need to integrate the recurrence and non-recurrence cases...

Do you mean perhaps doing this as part of #155638 or going even further and integrating foldOpIntoPhi / foldFreezeIntoRecurrence?

Integrating both functions is what I had in mind, but I'm not sure how hard that would be...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants